Abstract

Collaborative representation based classification (CRC) has received much attention in the field of pattern recognition. CRC
codes a testing sample as a linear combination of all the training samples and classifies the testing sample
into the class with the minimum representation error. However, the original CRC algorithm must perform a matrix
inverse operation, which may cause unstable numerical computation. In this paper, to overcome this problem,
we propose an iterative collaborative representation based classification (ICRC) algorithm. The experimental
results on face recognition show that ICRC outperforms CRC and obtains a much higher accuracy than
linear regression classification (LRC).
Introduction

Face recognition has received much attention in the last few decades [1,2,3]. Though many methods have been 
proposed in the past, face recognition is still faced with great challenges, which motivates researchers to devote 
their efforts to this field [4-8]. 

Feature extraction and classification are two important aspects of face recognition. Most previous feature
extraction methods fall into two categories, namely unsupervised and supervised methods.
Principal component analysis (PCA) [9] and independent component analysis (ICA) [10] are well-known unsupervised
feature extraction methods. Typical examples of supervised feature extraction methods include linear
discriminant analysis (LDA) [11], uncorrelated linear discriminant analysis (ULDA) [12] and so on.

Many classification methods can be used in face recognition [13, 14, 15]. The nearest neighbor classifier may be the 
most widely used method owing to its simplicity and effectiveness. Many variants of the nearest neighbor classifier 
including the local mean based nearest neighbor classifier [16], adaptive nearest neighbor classifier [17], K nearest 
neighbor classifier [18] and center-based nearest neighbor classifier [19] have also been used in face recognition. 

Recently, linear regression classification (LRC) [20] has been proposed for face recognition and obtained promising
performance. LRC is based on the assumption that face images from the same class lie on a linear subspace. LRC
appears to be directly related to the nearest subspace method, and it can be easily implemented by solving linear
equations. Wright et al. proposed sparse representation based classification (SRC) [21] and applied it to face
recognition successfully. SRC codes a testing sample as a sparse linear combination of all training samples and then
classifies the testing sample into the class with the minimum representation error. SRC differs from LRC in two
respects. First, SRC exploits a sparse linear combination of all training samples to represent the testing sample,
whereas LRC uses only a linear combination of the training samples of one class. Second,
SRC is based on $\ell_1$ regularization but LRC is not. The $\ell_1$ regularization means that SRC seeks its
solution under the condition that the $\ell_1$ norm of the solution vector is minimal. This implies that SRC does
not have a simple closed-form solution even though the problem it addresses is convex. In practice, SRC usually
relies on an iterative algorithm to calculate its solution. Differing from SRC, LRC has no similar constraint when
it seeks its solution. For this reason, we refer to SRC and LRC as an $\ell_1$ regularization based representation
classification method and a simple representation classification method, respectively. CRC is clearly also a simple
representation classification method.

Before CRC was proposed, the $\ell_1$ regularization was regarded as the core of SRC. However, Zhang et al.
analyzed SRC in depth and pointed out that collaborative representation is more important than the $\ell_1$
regularization [22]. So they proposed collaborative representation based classification (CRC). Extensive
experiments demonstrated the notable performance of CRC. Variants of CRC such as the method in [23] have also
been proposed for face recognition. The method in [23] can be viewed as an improvement to CRC. Because it uses
only the training samples of a small number of classes to represent the testing sample, it in effect obtains
a sparse representation of the testing sample via the training samples.

CRC must perform the matrix inverse operation, which may cause unstable numerical computation. Also, face
recognition is a typical small sample size problem [24]: the dimension of the training samples is much larger than
the number of training samples, so the testing sample cannot be accurately represented by the training
samples. These are the major factors that affect the performance of CRC.

In this paper, we propose an iterative algorithm for CRC. In this algorithm, the matrix inverse operation is
avoided, so the numerical instability caused by the matrix inverse operation can be eliminated. The rest of the paper
is organized as follows. In Section 2, we briefly review the LRC and CRC algorithms. In Section 3, iterative CRC is
described. In Section 4, several experiments are presented to demonstrate the improvement of the proposed algorithm.
Conclusions are summarized in Section 5.


Related works 

Linear Regression Classification 

Linear regression classification is based on the assumption that the samples of a specific class lie on a linear subspace. Under this
assumption, the testing sample is represented as a linear combination of class-specific training samples by using
the least squares method. Finally, the testing sample is assigned to the class with the minimum reconstruction error.

Assume we have $c$ different individuals. Let $x_i^p \in \mathbb{R}^{d}$ denote the $i$-th ($i \in \{1, 2, \cdots, n_p\}$) face image of the $p$-th subject, where
$d$ is the dimensionality of the face image vectors and $n_p$ is the number of training images of the $p$-th subject. $X_p$ is a class-specific
model generated by stacking the $d$-dimensional image vectors, $X_p = [x_1^p, x_2^p, \cdots, x_{n_p}^p] \in \mathbb{R}^{d \times n_p}$.

Given a testing sample $y \in \mathbb{R}^{d}$, LRC solves the following linear regression problem to get the representation
coefficients $\alpha_p \in \mathbb{R}^{n_p \times 1}$:

$$\alpha_p = \arg\min_{\alpha_p} \|y - X_p \alpha_p\|_2, \quad \forall p \in \{1, 2, \cdots, c\} \tag{1}$$

where $\|\cdot\|_2$ stands for the $\ell_2$ norm.

$\alpha_p$ can be estimated by using least-squares estimation:

$$\alpha_p = (X_p^T X_p)^{-1} X_p^T y \tag{2}$$

The testing sample can be reconstructed from Eq. (3):

$$\hat{y}_p = X_p \alpha_p = X_p (X_p^T X_p)^{-1} X_p^T y \tag{3}$$

The distance between the testing sample $y$ and the reconstructed sample $\hat{y}_p$ ($p = 1, 2, \cdots, c$) can be calculated
from Eq. (4):

$$d_p(y) = \|y - \hat{y}_p\|_2, \quad p = 1, 2, \cdots, c \tag{4}$$

The identity of the testing sample $y$ is given by the minimum reconstruction error as follows:

$$\mathrm{identity}(y) = \arg\min_{p} d_p(y) \tag{5}$$
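As an illustration, the LRC decision rule of Eqs. (1)-(5) can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions and not the implementation used in the experiments: the function name `lrc_predict` and the list-of-matrices input layout are hypothetical, and `numpy.linalg.lstsq` is used to solve the least-squares problem of Eq. (1) instead of forming the inverse of Eq. (2) explicitly.

```python
import numpy as np

def lrc_predict(class_matrices, y):
    """Return (predicted class index, list of distances d_p(y)).

    class_matrices: list of arrays X_p, one per class, each of shape (d, n_p)
    y: testing sample of shape (d,)
    """
    distances = []
    for X_p in class_matrices:
        # Least-squares coefficients alpha_p, Eq. (1)/(2).
        alpha_p, *_ = np.linalg.lstsq(X_p, y, rcond=None)
        y_hat_p = X_p @ alpha_p                          # reconstruction, Eq. (3)
        distances.append(np.linalg.norm(y - y_hat_p))    # distance, Eq. (4)
    return int(np.argmin(distances)), distances          # decision, Eq. (5)
```

Each class is handled independently, which mirrors the class-specific representation that distinguishes LRC from CRC.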


Collaborative Representation Classification 

Zhang et al. [22] pointed out that collaborative representation is the real power of SRC. Since face recognition is a typical
small-sample-size problem, $X_p$ alone is usually not enough to represent the testing sample $y$ well, even when $y$ is from
class $p$. So collaborative representation classification is proposed, which seeks to represent $y$ collaboratively by
using all the training samples from different classes. Assume $X = [X_1, X_2, \cdots, X_p, \cdots, X_c]$, $X_p \in \mathbb{R}^{d \times n_p}$, where $n_p$ is the
number of training samples of the $p$-th subject. $X$ includes all the training samples from all the classes. $\Psi = [\alpha_1, \alpha_2, \cdots, \alpha_c]$ ($\alpha_k \in \mathbb{R}^{n_k \times 1}$) is
the coefficient of $X$, where $\alpha_k$ collects the coefficients associated with $X_k$. The coefficient vector $\Psi$ can be estimated by:

$$\Psi = (X^T X + \lambda I)^{-1} X^T y \tag{6}$$

where $\lambda$ is a small positive constant and we set $\lambda = 0.01$ in this paper. Eq. (6) shows that CRC uses all the training
samples to represent the testing sample. The representation residual $e_k$ can be calculated as follows:

$$e_k = \|y - X_k \alpha_k\|_2, \quad k = 1, 2, \cdots, c \tag{7}$$

The testing sample is assigned to the class with the minimum residual:

$$\mathrm{identity}(y) = \arg\min_{k} e_k \tag{8}$$
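The CRC procedure of Eqs. (6)-(8) can be sketched as follows. This is only an illustrative sketch: the function name `crc_predict`, the `labels` array marking the class of each column of the training matrix, and the use of `numpy.linalg.solve` in place of an explicit matrix inverse are assumptions made here, while the default regularization constant 0.01 is the value stated in the text.

```python
import numpy as np

def crc_predict(X, labels, y, lam=0.01):
    """X: (d, n) training samples as columns; labels: (n,) NumPy array of class ids; y: (d,)."""
    n = X.shape[1]
    # Eq. (6): regularized least-squares coefficients over all training samples.
    psi = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    classes = np.unique(labels)
    # Eq. (7): residual of each class, using only that class's samples and coefficients.
    residuals = np.array([
        np.linalg.norm(y - X[:, labels == k] @ psi[labels == k]) for k in classes
    ])
    # Eq. (8): the class with the minimum residual.
    return classes[np.argmin(residuals)], psi, residuals
```

In contrast to LRC, a single linear system involving all classes is solved, and the class information enters only when the residuals are computed.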


Iterative CRC 

In this section, we introduce the iterative CRC algorithm for face recognition. From the description in Section 2, we
can find some similarities between LRC and CRC. First, both LRC and CRC compute the representation
coefficients under the $\ell_2$ norm. Second, the classification criteria of LRC and CRC are both based on the minimum
representation residual. The difference between LRC and CRC lies only in the collaborative representation. In
LRC, a testing sample is represented by training samples from a specific class, whereas in the original CRC algorithm a
testing sample is represented by all the training samples from all the classes. In face recognition, the number of
training samples is far smaller than the dimension of the training samples. Thus, both LRC and CRC usually suffer
from the problem that the testing sample cannot be entirely represented by a linear combination of the training
samples used. Moreover, LRC and CRC have to perform the matrix inverse operation. It is well known that the inverse
operation may cause unstable numerical computation.
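One common way to quantify how sensitive a matrix inverse is to numerical error is the condition number. The short sketch below is an illustration on synthetic data and is not taken from the experiments of this paper: the random matrix, its size, and the nearly duplicated column are assumptions chosen only to mimic two very similar training images. It prints the condition number of the matrix inverted in Eq. (2) and of the regularized matrix inverted in Eq. (6).

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 100, 10                        # hypothetical sizes: dimension d, n training samples
X = rng.standard_normal((d, n))
# Make two training samples nearly identical (synthetic stand-in for similar face images).
X[:, 1] = X[:, 0] + 1e-6 * rng.standard_normal(d)

gram = X.T @ X
cond_plain = np.linalg.cond(gram)                      # matrix inverted in Eq. (2)
cond_ridge = np.linalg.cond(gram + 0.01 * np.eye(n))   # matrix inverted in Eq. (6)
print(f"cond(X^T X)          = {cond_plain:.3e}")
print(f"cond(X^T X + 0.01 I) = {cond_ridge:.3e}")
```

A large condition number means that small perturbations of the data can produce large changes in the computed coefficients.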

Iterative methods have been introduced to the field of sparse representation with promising results.
For example, Aharon et al. [25] proposed an iterative method, called K-SVD, which generates the sparse coding
of samples based on the dictionary and updates the dictionary atoms. Xu et al. proposed a two-phase test sample
sparse representation method for face recognition [26]. In order to avoid the matrix inverse operation in CRC and
to obtain a more numerically stable solution, we propose iterative CRC. Following the idea of minimizing the
representation residual with the gradient descent algorithm, the coefficient of CRC is calculated iteratively
as follows.

First we use Eq. (6) to calculate the coefficient $\Psi$, which serves as the initial coefficient of ICRC. The error vector of the
reconstructed representation can be calculated as follows:

$$e = y - X\Psi \tag{9}$$

Then we can calculate the coefficient variation according to the error vector $e$:

$$\Delta\Psi = X^T e \tag{10}$$

Finally, we update the coefficient:

$$\Psi = \Psi + \tau \Delta\Psi \tag{11}$$

where $\tau$ is a positive scalar that balances the learning efficiency and convergence. In our experiments, we
set $\tau = 1/i$, where $i$ is the iteration number. The learning efficiency of ICRC therefore changes as the
iteration number increases. So ICRC has better convergence properties and is numerically stable.
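The update above can be read as one gradient descent step on the squared representation residual. The following short derivation is a sketch under that reading, with $\Psi$ the coefficient vector, $e$ the error vector of Eq. (9) and $\tau$ the positive step-size scalar of Eq. (11); the factor $\tfrac{1}{2}$ in the objective is an assumption made here for convenience, and the regularization term of Eq. (6) is left out.

```latex
\begin{aligned}
J(\Psi) &= \tfrac{1}{2}\,\lVert y - X\Psi \rVert_2^2, \\
\nabla_{\Psi} J(\Psi) &= -X^{T}(y - X\Psi) = -X^{T} e, \\
\Psi &\leftarrow \Psi - \tau\,\nabla_{\Psi} J(\Psi) = \Psi + \tau\,X^{T} e = \Psi + \tau\,\Delta\Psi .
\end{aligned}
```

The last line matches Eq. (10) and Eq. (11): the coefficient variation is the negative gradient of the residual energy, and no matrix inverse appears in the update.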

As we know, the solution obtained using the iterative method is influenced by the initialization step. Moreover, 
though the solution obtained using the original CRC algorithm has its drawback, it is still somewhat reasonable. 
Thus, our iterative method takes the solution of the original CRC algorithm as the initialization to iteratively 
calculate the ultimate solution. As our iterative method starts from an acceptable initialization, it is easier to reach 
the optimal solution. 

The iterative algorithm can be described as follows: 

Task: Renew the coefficient Psi
Psi_0 = Psi                               (Psi_0 is the original coefficient)

for i = 1 : iteration
    if i > 1
        y_hat = X × Psi                   (Reconstruct the testing sample)
        error = y - y_hat / norm(y_hat)   (Calculate the reconstructive error. norm( ) is the normalize function)
        y = y
    else
        [col, row] = size(X)              (Get the column and row of X)
        error = zeros(col × row, 1)       (If i = 1, Psi renews nothing)
    end
    Psi = Psi + X' × error / i            (Renew the coefficient)
    Psi = Psi / norm(Psi)                 (Normalize the new coefficient)
    if norm(Psi_0 - Psi) < 0.005
        break
    end
    Psi_0 = Psi
end


FIG. 1 THE CORE OPERATIONS OF THE ITERATIVE CRC ALGORITHM

From Fig. 1 we can see that no inverse matrix is calculated to update the coefficients. So ICRC can reduce
the possibility of unstable numerical computation. The normalization of $\hat{y}$ and of $\Psi$ are important steps
of ICRC. These steps ensure that the algorithm obtains a reasonable error. In other words, the normalization steps
keep $y$ and $\hat{y}$ in error = $y - \hat{y} / \mathrm{norm}(\hat{y})$ in almost the same range of $\ell_2$ norm. When $i > 1$, the algorithm
calculates the reconstructive error first, and then projects the training samples on the reconstructive error to get the
temporary coefficient. Finally, the algorithm updates the coefficients to complete one iteration. Due to the
normalization, the reconstructive error does not decrease as the iteration number increases. So in the update process, the
algorithm weights the added coefficient according to the iteration number. When $i = 1$, the coefficients are not updated,
and at this point ICRC is equivalent to CRC.

The distance measure and the class identity can be obtained from Eq. (7) and Eq. (8), respectively.
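The loop of Fig. 1 followed by the decision rule of Eqs. (7) and (8) can be sketched in Python as below. This is one possible reading of the figure and not the authors' code: the function names, the column-wise layout of the training matrix, the `labels` array, and the use of `numpy.linalg.solve` for the initial coefficient of Eq. (6) are assumptions made here. The defaults of 20 iterations, a step size of 1/i and a stopping threshold of 0.005 are the values stated in the text and in Fig. 1. The sketch also assumes that neither the reconstruction nor the coefficient vector becomes the zero vector.

```python
import numpy as np

def icrc_coefficients(X, y, lam=0.01, n_iter=20, tol=0.005):
    """Sketch of the loop in Fig. 1. X: (d, n) training samples as columns; y: (d,)."""
    n = X.shape[1]
    # Initial coefficient: the original CRC solution, Eq. (6).
    psi = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    psi_prev = psi.copy()
    for i in range(1, n_iter + 1):
        if i > 1:
            y_hat = X @ psi                             # reconstruct the testing sample
            error = y - y_hat / np.linalg.norm(y_hat)   # error against the normalized reconstruction
        else:
            error = np.zeros(X.shape[0])                # first pass: nothing is renewed
        psi = psi + X.T @ error / i                     # Eqs. (10)-(11) with step size 1/i
        psi = psi / np.linalg.norm(psi)                 # normalize the new coefficient
        if np.linalg.norm(psi_prev - psi) < tol:        # stop when the coefficient barely changes
            break
        psi_prev = psi
    return psi

def icrc_predict(X, labels, y, **kwargs):
    """labels: (n,) NumPy array of class ids, one per column of X."""
    psi = icrc_coefficients(X, y, **kwargs)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - X[:, labels == k] @ psi[labels == k])
                 for k in classes]                      # Eq. (7)
    return classes[int(np.argmin(residuals))]           # Eq. (8)
```

Only matrix-vector products appear inside the loop; the single linear solve is the CRC initialization.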

Experiments and Discussion 

In this section, ICRC is evaluated on two publicly available face datasets, the AR and ORL face datasets. All face
images from the two databases are properly aligned and cropped with only the facial region retained. In order to
illustrate the effectiveness of ICRC, we also compare it with two state-of-the-art methods: LRC and CRC.

Experiments on AR Database 

The AR [27] dataset consists of over 4000 frontal images for 126 subjects. In our experiments, we choose a subset 
including 2400 non-occluded frontal views from 120 subjects. All the 20 images of a sample gallery subject are 
illustrated in Fig. 2. 
FIG. 2 SAMPLE IMAGES OF ONE SUBJECT IN AR DATASET.
TABLE 1 THE RECOGNITION RATE OF COMPETING METHODS ON AR DATABASE.

Method    p = 2     p = 3     p = 4     p = 5
LRC       0.5924    0.5909    0.5811    0.5956
CRC       0.7708    0.7642    0.7594    0.7667
ICRC      0.7931    0.7975    0.7859    0.7800


FIG. 3 THE RECOGNITION RATE OF DIFFERENT ITERATION (FROM 1 TO 40). THE TRAINING SET INCLUDES THE FIRST 4
FACE IMAGES OF EVERY CLASS ON AR FACE DATABASE.
(Plot of the face recognition rate against the number of iterations.)

FIG. 4 THE RECOGNITION RATE OF DIFFERENT ITERATION (FROM 1 TO 40). THE TRAINING SET INCLUDES THE FIRST 5
FACE IMAGES OF EVERY CLASS ON AR FACE DATABASE.
(Plot of the face recognition rate against the number of iterations.)


In this experiment, we use the first p images of each class to form the training set, and the remaining images of each
class to form the testing set. The number of iterations is set to 20. The recognition rates are shown in
Table 1.
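The split used in this protocol can be sketched as follows. The helper name `split_first_p` and the assumption that the samples of each class are stored in their original database order are hypothetical; the sketch only illustrates how the first p images of every class go to the training set and the rest to the testing set.

```python
import numpy as np

def split_first_p(labels, p):
    """Return (train_idx, test_idx) for a label array ordered as in the database."""
    train_idx, test_idx = [], []
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)   # indices of class k in their original order
        train_idx.extend(idx[:p])           # first p images of the class: training set
        test_idx.extend(idx[p:])            # remaining images of the class: testing set
    return np.array(train_idx), np.array(test_idx)
```

With the AR subset described above (120 subjects with 20 images each), p = 4 would give 480 training images and 1920 testing images.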


We can see that the recognition rates of CRC and ICRC are both about 17 to 21 percentage points higher than that of LRC,
a relative improvement of roughly 29% to 35%. The role of collaborative representation in
face recognition is clearly immense. As an improvement of CRC, ICRC obtains a higher recognition rate than
CRC in all cases. In the best case (p = 3), ICRC improves on CRC by 3.3 percentage points, a relative gain of over 4%.

In order to observe the relationship between the number of iterations and the recognition
rate, two experiments have been done and the results are shown as follows.

We take the first 4 and the first 5 face images of every class as the training set and take the others as the testing set. The number
of iterations ranges from 1 to 40. Fig. 3 and Fig. 4 show the results of the two experiments.

Fig. 3 and Fig. 4 show that the recognition rate decreases as the number of iterations increases while the
number of iterations is less than 4. After that, the recognition rate increases with the number of iterations.


Experiments on ORL database 

The ORL database contains 40 persons, each with 10 different images [28]. The images of the same
person are taken at different times, under slightly varying lighting conditions and with various facial expressions.
Some persons are captured with or without glasses. The heads in the images are slightly tilted or rotated. The images
in the database are manually cropped. Fig. 5 shows ten images of a sample gallery subject.

In this experiment we take the last p images of each class as the training samples and the remaining images of each
person as the testing samples. The recognition rates are shown in Table 2.

From the results in Table 2, we can see that the advantages of CRC are not evident on this database. When p = 2 and p = 4 the
recognition rate of CRC is even lower than that of LRC. When p = 3 the accuracies of CRC and LRC are the same, and only
when p = 5 is CRC higher than LRC. But the recognition rate of ICRC is higher than that of LRC and CRC except in the case of p = 2.

From Fig. 6 and Fig. 7 we can see that the experiments on the ORL face database follow a similar law to the experiments
on the AR face database. But on the ORL face database, the recognition rate of ICRC stabilizes rapidly and reaches the
top rate. Compared with the ORL face database, the AR face database has a larger number of classes and samples.
This may be the major reason why the results in Fig. 3 and Fig. 4 are less stable than those on the ORL face database.

FIG. 5 SAMPLE IMAGES OF ONE SUBJECT IN THE ORL DATASET.

TABLE 2 THE RECOGNITION RATE OF COMPETING METHODS ON ORL DATABASE.

Method    p = 2     p = 3     p = 4     p = 5
LRC       0.7750    0.8571    0.9292    0.9300
CRC       0.7656    0.8571    0.9250    0.9400
ICRC      0.7656    0.8929    0.9458    0.9500

FIG. 6 THE RECOGNITION RATE OF DIFFERENT ITERATION (FROM 1 TO 40). THE TRAINING SET INCLUDES THE LAST 4
FACE IMAGES OF EVERY CLASS ON ORL FACE DATABASE.
(Plot of the face recognition rate against the number of iterations.)

FIG. 7 THE RECOGNITION RATE OF DIFFERENT ITERATION (FROM 1 TO 40). THE TRAINING SET INCLUDES THE LAST 5
FACE IMAGES OF EVERY CLASS ON ORL FACE DATABASE.
(Plot of the face recognition rate against the number of iterations.)


Conclusions 

In this paper we propose ICRC for face recognition. The experimental results show that ICRC obtains more 
accurate recognition results than CRC and LRC in face recognition. This shows that in ICRC the increase of the 
numerical stability also contributes to the improvement of recognition results. In addition, it seems that the 
performance of ICRC may be further improved and we will study it in the future. 